Goto

Collaborating Authors

 lf model


Generative AI-enhanced Probabilistic Multi-Fidelity Surrogate Modeling Via Transfer Learning

Zeng, Jice, Barajas-Solano, David, Chen, Hui

arXiv.org Machine Learning

The performance of machine learning surrogates is critically dependent on data quality and quantity. This presents a major challenge, as high-fidelity (HF) data is often scarce and computationally expensive to acquire, while low-fidelity (LF) data is abundant but less accurate. To address this data-scarcity problem, we develop a probabilistic multi-fidelity surrogate framework based on generative transfer learning. We employ a normalizing flow (NF) generative model as the backbone, which is trained in two phases: (i) the NF is first pretrained on a large LF dataset to learn a probabilistic forward model; (ii) the pretrained model is then fine-tuned on a small HF dataset, allowing it to correct for LF-HF discrepancies via knowledge transfer. To relax the dimension-preserving constraint of standard bijective NFs, we integrate surjective (dimension-reducing) layers with standard coupling blocks. This architecture enables learned dimension reduction while preserving the ability to train with exact likelihoods. The resulting surrogate provides fast probabilistic predictions with quantified uncertainty and significantly outperforms LF-only baselines while using fewer HF evaluations. We validate the approach on a reinforced concrete slab benchmark, combining many coarse-mesh (LF) simulations with a limited set of fine-mesh (HF) simulations. The proposed model achieves probabilistic predictions with HF accuracy, demonstrating a practical path toward data-efficient, generative AI-driven surrogates for complex engineering systems. Email address: David.Barajas-Solano@pnnl.gov (David Barajas-Solano) Introduction High-fidelity (HF) computer modeling using discretization schemes such as the finite elements (FE) method provides a rigorous framework for analyzing and predicting the behavior of complex engineering systems.


Adaptive Machine Learning-Driven Multi-Fidelity Stratified Sampling for Failure Analysis of Nonlinear Stochastic Systems

Xu, Liuyun, Spence, Seymour M. J.

arXiv.org Artificial Intelligence

Existing variance reduction techniques used in stochastic simulations for rare event analysis still require a substantial number of model evaluations to estimate small failure probabilities. In the context of complex, nonlinear finite element modeling environments, this can become computationally challenging-particularly for systems subjected to stochastic excitation. To address this challenge, a multi-fidelity stratified sampling scheme with adaptive machine learning metamodels is introduced for efficiently propagating uncertainties and estimating small failure probabilities. In this approach, a high-fidelity dataset generated through stratified sampling is used to train a deep learning-based metamodel, which then serves as a cost-effective and highly correlated low-fidelity model. An adaptive training scheme is proposed to balance the trade-off between approximation quality and computational demand associated with the development of the low-fidelity model. By integrating the low-fidelity outputs with additional high-fidelity results, an unbiased estimate of the strata-wise failure probabilities is obtained using a multi-fidelity Monte Carlo framework. The overall probability of failure is then computed using the total probability theorem. Application to a full-scale high-rise steel building subjected to stochastic wind excitation demonstrates that the proposed scheme can accurately estimate exceedance probability curves for nonlinear responses of interest, while achieving significant computational savings compared to single-fidelity variance reduction approaches.


Practical multi-fidelity machine learning: fusion of deterministic and Bayesian models

Yi, Jiaxiang, Cheng, Ji, Bessa, Miguel A.

arXiv.org Machine Learning

Multi-fidelity machine learning methods address the accuracy-efficiency trade-off by integrating scarce, resource-intensive high-fidelity data with abundant but less accurate low-fidelity data. We propose a practical multi-fidelity strategy for problems spanning low- and high-dimensional domains, integrating a non-probabilistic regression model for the low-fidelity with a Bayesian model for the high-fidelity. The models are trained in a staggered scheme, where the low-fidelity model is transfer-learned to the high-fidelity data and a Bayesian model is trained for the residual. This three-model strategy -- deterministic low-fidelity, transfer learning, and Bayesian residual -- leads to a prediction that includes uncertainty quantification both for noisy and noiseless multi-fidelity data. The strategy is general and unifies the topic, highlighting the expressivity trade-off between the transfer-learning and Bayesian models (a complex transfer-learning model leads to a simpler Bayesian model, and vice versa). We propose modeling choices for two scenarios, and argue in favor of using a linear transfer-learning model that fuses 1) kernel ridge regression for low-fidelity with Gaussian processes for high-fidelity; or 2) deep neural network for low-fidelity with a Bayesian neural network for high-fidelity. We demonstrate the effectiveness and efficiency of the proposed strategies and contrast them with the state-of-the-art based on various numerical examples. The simplicity of these formulations makes them practical for a broad scope of future engineering applications.


A Latent Variable Approach for Non-Hierarchical Multi-Fidelity Adaptive Sampling

Chen, Yi-Ping, Wang, Liwei, Comlek, Yigitcan, Chen, Wei

arXiv.org Machine Learning

Multi-fidelity (MF) methods are gaining popularity for enhancing surrogate modeling and design optimization by incorporating data from various low-fidelity (LF) models. While most existing MF methods assume a fixed dataset, adaptive sampling methods that dynamically allocate resources among fidelity models can achieve higher efficiency in the exploring and exploiting the design space. However, most existing MF methods rely on the hierarchical assumption of fidelity levels or fail to capture the intercorrelation between multiple fidelity levels and utilize it to quantify the value of the future samples and navigate the adaptive sampling. To address this hurdle, we propose a framework hinged on a latent embedding for different fidelity models and the associated pre-posterior analysis to explicitly utilize their correlation for adaptive sampling. In this framework, each infill sampling iteration includes two steps: We first identify the location of interest with the greatest potential improvement using the high-fidelity (HF) model, then we search for the next sample across all fidelity levels that maximize the improvement per unit cost at the location identified in the first step. This is made possible by a single Latent Variable Gaussian Process (LVGP) model that maps different fidelity models into an interpretable latent space to capture their correlations without assuming hierarchical fidelity levels. The LVGP enables us to assess how LF sampling candidates will affect HF response with pre-posterior analysis and determine the next sample with the best benefit-to-cost ratio. Through test cases, we demonstrate that the proposed method outperforms the benchmark methods in both MF global fitting (GF) and Bayesian Optimization (BO) problems in convergence rate and robustness. Moreover, the method offers the flexibility to switch between GF and BO by simply changing the acquisition function.


Accelerated and Inexpensive Machine Learning for Manufacturing Processes with Incomplete Mechanistic Knowledge

Cleeman, Jeremy, Agrawala, Kian, Malhotra, Rajiv

arXiv.org Artificial Intelligence

Machine Learning (ML) is of increasing interest for modeling parametric effects in manufacturing processes. But this approach is limited to established processes for which a deep physics-based understanding has been developed over time, since state-of-the-art approaches focus on reducing the experimental and/or computational costs of generating the training data but ignore the inherent and significant cost of developing qualitatively accurate physics-based models for new processes . This paper proposes a transfer learning based approach to address this issue, in which a ML model is trained on a large amount of computationally inexpensive data from a physics-based process model (source) and then fine-tuned on a smaller amount of costly experimental data (target). The novelty lies in pushing the boundaries of the qualitative accuracy demanded of the source model, which is assumed to be high in the literature, and is the root of the high model development cost. Our approach is evaluated for modeling the printed line width in Fused Filament Fabrication. Despite extreme functional and quantitative inaccuracies in the source our approach reduces the model development cost by years, experimental cost by 56-76%, computational cost by orders of magnitude, and prediction error by 16-24%.


General multi-fidelity surrogate models: Framework and active learning strategies for efficient rare event simulation

Chakroborty, Promit, Dhulipala, Somayajulu L. N., Che, Yifeng, Jiang, Wen, Spencer, Benjamin W., Hales, Jason D., Shields, Michael D.

arXiv.org Artificial Intelligence

Estimating the probability of failure for complex real-world systems using high-fidelity computational models is often prohibitively expensive, especially when the probability is small. Exploiting low-fidelity models can make this process more feasible, but merging information from multiple low-fidelity and high-fidelity models poses several challenges. This paper presents a robust multi-fidelity surrogate modeling strategy in which the multi-fidelity surrogate is assembled using an active learning strategy using an on-the-fly model adequacy assessment set within a subset simulation framework for efficient reliability analysis. The multi-fidelity surrogate is assembled by first applying a Gaussian process correction to each low-fidelity model and assigning a model probability based on the model's local predictive accuracy and cost. Three strategies are proposed to fuse these individual surrogates into an overall surrogate model based on model averaging and deterministic/stochastic model selection. The strategies also dictate which model evaluations are necessary. No assumptions are made about the relationships between low-fidelity models, while the high-fidelity model is assumed to be the most accurate and most computationally expensive model. Through two analytical and two numerical case studies, including a case study evaluating the failure probability of Tristructural isotropic-coated (TRISO) nuclear fuels, the algorithm is shown to be highly accurate while drastically reducing the number of high-fidelity model calls (and hence computational cost).


A Practical Second-order Latent Factor Model via Distributed Particle Swarm Optimization

Wang, Jialiang, Zhong, Yurong, Li, Weiling

arXiv.org Artificial Intelligence

Latent Factor (LF) models are effective in representing high-dimension and sparse (HiDS) data via low-rank matrices approximation. Hessian-free (HF) optimization is an efficient method to utilizing second-order information of an LF model's objective function and it has been utilized to optimize second-order LF (SLF) model. However, the low-rank representation ability of a SLF model heavily relies on its multiple hyperparameters. Determining these hyperparameters is time-consuming and it largely reduces the practicability of an SLF model. To address this issue, a practical SLF (PSLF) model is proposed in this work. It realizes hyperparameter self-adaptation with a distributed particle swarm optimizer (DPSO), which is gradient-free and parallelized. Experiments on real HiDS data sets indicate that PSLF model has a competitive advantage over state-of-the-art models in data representation ability.


Architectural Optimization and Feature Learning for High-Dimensional Time Series Datasets

Colgan, Robert E., Yan, Jingkai, Márka, Zsuzsa, Bartos, Imre, Márka, Szabolcs, Wright, John N.

arXiv.org Artificial Intelligence

As our ability to sense increases, we are experiencing a transition from data-poor problems, in which the central issue is a lack of relevant data, to data-rich problems, in which the central issue is to identify a few relevant features in a sea of observations. Motivated by applications in gravitational-wave astrophysics, we study the problem of predicting the presence of transient noise artifacts in a gravitational wave detector from a rich collection of measurements from the detector and its environment. We argue that feature learning--in which relevant features are optimized from data--is critical to achieving high accuracy. We introduce models that reduce the error rate by over 60% compared to the previous state of the art, which used fixed, hand-crafted features. Feature learning is useful not only because it improves performance on prediction tasks; the results provide valuable information about patterns associated with phenomena of interest that would otherwise be undiscoverable. In our application, features found to be associated with transient noise provide diagnostic information about its origin and suggest mitigation strategies. Learning in high-dimensional settings is challenging. Through experiments with a variety of architectures, we identify two key factors in successful models: sparsity, for selecting relevant variables within the high-dimensional observations; and depth, which confers flexibility for handling complex interactions and robustness with respect to temporal variations. We illustrate their significance through systematic experiments on real detector data. Our results provide experimental corroboration of common assumptions in the machine-learning community and have direct applicability to improving our ability to sense gravitational waves, as well as to many other problem settings with similarly high-dimensional, noisy, or partly irrelevant data.


Reliability Estimation of an Advanced Nuclear Fuel using Coupled Active Learning, Multifidelity Modeling, and Subset Simulation

Dhulipala, Somayajulu L. N., Shields, Michael D., Chakroborty, Promit, Jiang, Wen, Spencer, Benjamin W., Hales, Jason D., Laboure, Vincent M., Prince, Zachary M., Bolisetti, Chandrakanth, Che, Yifeng

arXiv.org Machine Learning

Tristructural isotropic (TRISO)-coated particle fuel is a robust nuclear fuel and determining its reliability is critical for the success of advanced nuclear technologies. However, TRISO failure probabilities are small and the associated computational models are expensive. We used coupled active learning, multifidelity modeling, and subset simulation to estimate the failure probabilities of TRISO fuels using several 1D and 2D models. With multifidelity modeling, we replaced expensive high-fidelity (HF) model evaluations with information fusion from two low-fidelity (LF) models. For the 1D TRISO models, we considered three multifidelity modeling strategies: only Kriging, Kriging LF prediction plus Kriging correction, and deep neural network (DNN) LF prediction plus Kriging correction. While the results across these multifidelity modeling strategies compared satisfactorily, strategies employing information fusion from two LF models consistently called the HF model least often. Next, for the 2D TRISO model, we considered two multifidelity modeling strategies: DNN LF prediction plus Kriging correction (data-driven) and 1D TRISO LF prediction plus Kriging correction (physics-based). The physics-based strategy, as expected, consistently required the fewest calls to the HF model. However, the data-driven strategy had a lower overall simulation time since the DNN predictions are instantaneous, and the 1D TRISO model requires a non-negligible simulation time.


ML Researchers Propose A Novel AI Model To Effectively Handle The Texture-Age Evolution

#artificialintelligence

A lifespan face synthesis model aims to create a set of photo-realistic images that show what someone's whole life would look like, given just one picture as a reference. The generated image is expected to be age-sensitive with realistic transformations in shape and texture while maintaining their identity. This task is challenging because faces undergo separate but highly nonlinear changes when it comes to how they change due to ageing; for example, skin loses elasticity which can make them appear wrinkled or saggy more quickly than other parts of your body might start changing. The latest LFS (lifespan face synthesis) models are based on the new generative adversarial networks that use conditional transformations to allow people's age code to be seen. They have significantly benefitted from recent advancements of GANs, and they're still improving every day.